Emotional Speech Recognition and Emotion Identification in Farsi Language
Authors
Abstract
Speech emotion can add information to speech beyond the available textual content; however, it also introduces problems for the speech recognition process. In a previous study, we showed the substantial changes in speech parameters caused by speech emotion. Therefore, to improve emotional speech recognition rate, the effects of emotion on speech parameters should first be evaluated, and the recognition accuracy should then be improved through the application of suitable parameters. The changes in speech parameters, i.e. formant frequencies and pitch frequency, due to anger and grief were evaluated for the Farsi language in our former research. In this research, using those results, we try to improve emotional speech recognition accuracy using baseline models. We show that adding parameters such as formant and pitch frequencies to the speech feature vector can improve recognition accuracy. The amount of improvement depends on the parameter type, the number of mixture components, and the emotional condition. Proper identification of the emotional condition can also help improve speech recognition accuracy. To recognize the emotional condition of speech, formant and pitch frequencies were used successfully in two different approaches, namely a decision tree and a GMM.
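As a rough illustration of the two ideas in the abstract, the sketch below shows (i) appending pitch and formant frequencies to a per-frame spectral feature vector and (ii) identifying the emotional condition with one GMM per emotion class. This is a minimal sketch assuming NumPy/scikit-learn interfaces and illustrative array shapes; the paper's actual feature extraction, corpus, and model configuration are not reproduced here.

# Sketch only: shapes and class labels are illustrative assumptions,
# not the setup used in the paper.
import numpy as np
from sklearn.mixture import GaussianMixture


def augment_features(mfcc, pitch, formants):
    """Append pitch and formant frequencies to each frame's feature vector.

    mfcc:     (n_frames, n_mfcc)  spectral features per frame
    pitch:    (n_frames,)         pitch frequency per frame (Hz)
    formants: (n_frames, n_form)  formant frequencies per frame (Hz)
    """
    return np.hstack([mfcc, pitch[:, None], formants])


class GMMEmotionIdentifier:
    """One GMM per emotional condition; choose the class with the highest likelihood."""

    def __init__(self, n_components=8):
        self.n_components = n_components
        self.models = {}

    def fit(self, features_by_emotion):
        # features_by_emotion: dict mapping emotion label -> (n_frames, dim) array
        for emotion, feats in features_by_emotion.items():
            gmm = GaussianMixture(n_components=self.n_components,
                                  covariance_type="diag", random_state=0)
            gmm.fit(feats)
            self.models[emotion] = gmm

    def predict(self, feats):
        # Average per-frame log-likelihood under each emotion model; return the best.
        scores = {emotion: model.score(feats) for emotion, model in self.models.items()}
        return max(scores, key=scores.get)

The number of mixture components (n_components above) is one of the factors the abstract reports as affecting the amount of improvement; the value 8 here is only a placeholder.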
Similar Resources
Recognition of Emotional Speech and Speech Emotion in Farsi
Speech emotion can add extra information to speech in comparison with available textual information. However, it can also lead to some problems in the automatic speech recognition process. We evaluated the changes in speech parameters, i.e. formant frequencies and pitch frequency, due to anger and grief for the Farsi language in a former study. Here, using those results, we try to improve emotio...
Enhancing Multilingual Recognition of Emotion in Speech by Language Identification
We investigate, for the first time, if applying model selection based on automatic language identification (LID) can improve multilingual recognition of emotion in speech. Six emotional speech corpora from three language families (Germanic, Romance, Sino-Tibetan) are evaluated. The emotions are represented by the quadrants in the arousal/valence plane, i. e., positive/negative arousal/valence. ...
Speech emotion recognition in emotional feedback for Human-Robot Interaction
For robots to plan their actions autonomously and interact with people, recognizing human emotions is crucial. For most humans, nonverbal cues such as pitch, loudness, spectrum, and speech rate are efficient carriers of emotion. The features of the sound of a spoken voice probably contain crucial information on the emotional state of the speaker; within this framework, a machine might use such pro...
Data Pre-processing in Emotional Speech Synthesis by Emotion Recognition
Synthesizing emotional speech by means of conversion from neutral speech allows us to generate emotional speech from many existing Text-to-Speech (TTS) systems. How much of the target emotion can be portrayed by the generated speech is largely dependent on the emotion data used to train the mapping function for voice transformation. In this paper, we introduce a method to pre-process the emotio...
Emotion attribute projection for speaker recognition on emotional speech
Emotion is one of the important factors that cause system performance degradation. By analyzing the similarity between channel effects and emotion effects on speaker recognition, an emotion compensation method called emotion attribute projection (EAP) is proposed to alleviate intra-speaker emotion variability. The use of this method has achieved an equal error rate (EER) reduction of 11.7%...
Emotion Identification for Evaluation of Synthesized Emotional Speech
In this paper, we propose to evaluate the quality of emotional speech synthesis by means of an automatic emotion identification system. We test this approach using five different parametric speech synthesis systems, ranging from plain non-emotional synthesis to full re-synthesis of pre-recorded speech. We compare the results achieved with the automatic system to those of human perception tests....
Journal: The Modares Journal of Electrical Engineering
Publisher: Tarbiat Modares University
ISSN: 2228-527X
Volume 8, Issue 1, 2008
Keywords